CS 545: Assignment 8 Dan

Author

  • Dan Elliott
Abstract

Mixture of Probabilistic Principal Component Analyzers (MPPCA) is a seminal work in machine learning in that it was the first to use PCA to perform clustering and local dimensionality reduction. MPPCA is based upon the Mixture of Factor Analyzers (MFA), which is similar to MPPCA except that it uses Factor Analysis to estimate the covariance matrix. This algorithm is of interest to me because it is relevant to my Thesis work.

Although MPPCA is presented as an improved variant of the mixture of Gaussians model, it may be better thought of as a form of covariance estimation. In covariance estimation, the covariance matrix is estimated using a technique other than the standard XX^T method of computing the covariance of a data set X = [x_1, x_2, ..., x_N]. Furthermore, MPPCA can be viewed as a form of covariance regularization, which is a superset of covariance estimation. Covariance regularization includes methods like shrinkage, in which the empirical covariance matrix is modified to be non-singular [7]. A common technique for computing a mixture of Gaussians (MoG) in high dimensions, constraining the covariance matrices to be diagonal, is also a form of covariance regularization. Historically, covariance regularization methods have been employed out of necessity, when the training data are too few to produce a non-singular covariance matrix. In my Thesis, however, we see that covariance regularization techniques are useful beyond making the covariance matrix non-singular: they can benefit classification when used as part of a MoG model, even when they are not strictly necessary.

An evaluation of MPPCA as a covariance regularization method would build upon my Thesis work in that it is a popular algorithm (with respect to the number of citations), it was derived by a respected figure in the machine learning community (and author of our course textbook), and it is a form of MoG with covariance regularization. In addition, my Thesis did not compare against a covariance estimation algorithm like MPPCA, in which the covariance matrix is not merely modified, as with shrinkage, but is completely re-estimated using PCA. Adding to the intrigue of this algorithm is that, despite its well-known status in the community, I have been unable to find a single published result that uses it outside the publication that introduced it. Finally, an understanding of this algorithm's performance and capabilities could benefit not only the machine learning community but also other communities that seek better techniques for analyzing data sets [6].
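
To make the covariance regularization interpretation concrete, the sketch below computes a single-component PPCA covariance estimate, C = W W^T + sigma^2 I, from the closed-form maximum-likelihood solution of Tipping and Bishop; MPPCA fits one such model per mixture component inside EM. This is a minimal illustration only: the function name ppca_covariance, the choice of q, the eps guard, and the toy data are my own assumptions, not part of the assignment or the original paper.

```python
import numpy as np


def ppca_covariance(X, q, eps=1e-12):
    """Single-component PPCA covariance estimate C = W W^T + sigma^2 I.

    X is (N, d) with rows as samples; q is the number of retained
    principal components. Sketch of the closed-form ML solution.
    """
    N, d = X.shape
    Xc = X - X.mean(axis=0)            # center the data
    S = Xc.T @ Xc / N                  # empirical covariance (the XX^T estimate)

    # Eigendecomposition, sorted so eigenvalues are in decreasing order.
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Noise variance: average of the d - q discarded eigenvalues.
    sigma2 = evals[q:].mean() if q < d else 0.0

    # ML weight matrix W = U_q (Lambda_q - sigma^2 I)^{1/2} (rotation R = I).
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, eps))

    # Regularized covariance: full rank even when S is singular (N < d).
    C = W @ W.T + sigma2 * np.eye(d)
    return C, W, sigma2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 50))      # fewer samples than dimensions: S is singular
    C, W, sigma2 = ppca_covariance(X, q=5)
    print(np.linalg.matrix_rank(C))    # C is full rank despite N < d
```

The point of the toy run is the regularization effect the abstract describes: with 30 samples in 50 dimensions the empirical covariance S is singular, but the PPCA estimate C is full rank because the discarded variance is folded back in as isotropic noise sigma^2 I.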


Similar Articles

CS 545 Assignment 8 A Survey on Approaches to Improve kNN Classification Performance

3.1 Vector Model . . . 4  3.2 Terms Weighting . . . 4  3.3 Similarity Measuring . . . 5  4 Improving kNN Performance using Different Term Weighting Sch...

Full text

CS 545: Assignment 8 A Scaled Conjugate Gradient Implementation for R Andy

3 Implementation Details 3  3.1 R Code . . . 3  3.2 C Code . . . 4  3.2.1 Efficient Vector Manipulation Functions . . . 4  3.2.2 The Core SCG Implementation . . .

Full text

CS 545: Assignment 8

A Self-Organizing Map (SOM) is a type of artificial neural network used to transform a data set of vectors into a set of lower dimensional vectors. The applications of this are wider than one might expect, but this is the heart of its purpose. The data are usually some set of vectors {x ∈ Rn}, but a SOM can work with any set of vectors that have a well-defined distance metric. The canonical exa...

Full text

CS 545 Project ParamNN: A Parameter Prediction Neural Network

3 ParamNN 2  3.1 Randomly Generated Seed Parameters . . . 3  3.2 Predicted Parameters . . . 4  3.3 Best Predicted Parameters . . . 4  3.4 Jittered Best Predicted Parameters . . .

Full text

arXiv:0806.3258v3 [cs.DS] 5 Sep 2008 Local Search Heuristics for the Multidimensional Assignment Problem

The Multidimensional Assignment Problem (MAP) (abbreviated sAP in the case of s dimensions) is an extension of the well-known assignment problem. The most studied case of MAP is 3-AP, though problems with larger values of s also have a number of applications. In this paper we propose several local search heuristics for MAP and select the dominating ones. The results of computational experim...

Full text


Journal:

Volume   Issue

Pages  -

Publication date: 2009